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Abstract 

Optical Coherence Tomography (OCT) is an emerging technique in the field 
of biomedical imaging, with applications in ophthalmology, dermatology, 
coronary imaging etc. OCT images usually suffer from a granular pattern, 
called speckle noise, which restricts the process of interpretation. Therefore 
the need for speckle noise reduction techniques is of high importance. To the 
best of our knowledge, use of Independent Component Analysis (ICA) tech¬ 
niques has never been explored for speckle reduction of OCT images. Here, 
a comparative study of several ICA techniques (InfoMax, JADE, FastICA 
and SOBI) is provided for noise reduction of retinal OCT images. Hav¬ 
ing multiple B-scans of the same location, the eye movements are compen¬ 
sated using a rigid registration technique. Then, different ICA techniques 
are applied to the aggregated set of B-scans for extracting the noise-free 
image. Signal-to-Noise-Ratio (SNR), Contrast-to-Noise-Ratio (CNR) and 
Equivalent-Number-of-Looks (ENL), as well as analysis on the computational 
complexity of the methods, are considered as metrics for comparison. The 
results show that use of ICA can be beneficial, especially in case of having 
fewer number of B-scans. 
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1. Introduction 


OCT is a powerful imaging system for non-invasive acquisition of 3D 
volumetric images of tissues [T], with applications in ophthalmology, der¬ 
matology, coronary imaging etc. Due to its underlying physics, which is 
common in narrow-band detection systems like Synthetic-Aperture Radar 
(SAR) and ultrasound, OCT images usually suffer from a granular pattern 
called speckle. Not only the optical properties of the system, but also the 
motion of the subject to be imaged, size and temporal coherence of the light 
source, multiple scattering, phase deviation of the beam and aperture of the 
detector can affect the speckle [2], Fig. 1 shows a sample retinal OCT image, 
highly degraded by speckle noise. 

Speckle is considered to be multiplicative noise, in contrast to the additive 
Gaussian noise. Limited dynamic range of displays requires us to compress 
the OCT signals usually by a logarithmic transform, which converts the mul¬ 
tiplicative speckle noise to additive noise [3]. Two major classes of speckle 
noise reduction techniques are: 1) methods of noise reduction during the 
acquisition time and 2) post-processing techniques. In the first class multi¬ 
ple uncorrelated recordings are averaged. This includes spatial [3], angular 
[5] , polarization [6] and frequency [7j compounding techniques. As for post¬ 
processing, anisotropic diffusion-based techniques [3]J and multi-scale/multi- 
resolution geometric representation techniques [8] are of high interest between 
scholars. Use of compressive sensing and sparse representation have also been 
explored in the past few years [5j. For a more complete review on the dif¬ 
ferent image analysis techniques in OCT image processing, including noise 
reduction, the reader is referred to ra and references therein. 

Post-processing averaging/median filtering is also an interesting method 
for speckle reduction. In such techniques, multiple B-scans of the same loca¬ 
tion are acquired and then the average/median is taken. The misalignment 
between the different B-scans is usually compensated with a parametric im¬ 
age registration technique. Theoretically, having N B-scans with uncorre¬ 
lated speckle, SNR can be improved by a factor of y/N. The works presented 
in mm can be mentioned as examples. Recently the use of sparse and 
low-rank decomposition based batch image alignment was explored by the 
authors [T3] . 

In this paper the use of Independent Component Analysis (ICA) tech¬ 
niques for speckle noise reduction of retinal OCT images is explored, which 
to the best of our knowledge has never been investigated before. Having 
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Figure 1: Sample retinal OCT image degraded by speckle noise; selected 
ROIs are shown with blue rectangles. 


multiple B-scans of the same location in retina, the eye movement is com¬ 
pensated by considering a rigid transformation between consecutive B-scans 
using ImageJ ua. Having negligible eye motion within each B-scan, the need 
for deformable registration techniques [15j [16] can be eliminated. Then, sev¬ 
eral ICA techniques are used for extracting the noise-free image from multiple 
noisy B-scans. SNR, CNR and ENL are considered as metrics for comparing 
the performance of different methods. Investigating the results reveals inter¬ 
esting facts regarding the impact of using ICA for speckle noise reduction. 
Especially in case of having a few number of images in which ICA techniques 
provide better performance in comparison to median filtering. But increasing 
the number of images causes the ICA techniques to perform poorly, since the 
main assumption in ICA is having non-Gaussian uncorrelated components 
which can’t be completely satisfied here. 

2. Independent Component Analysis (ICA) 

ICA is one of the most widely used techniques for Blind Source Separation 
(BSS) p2j. The problem of BSS consists of having interfering signals from 
multiple sources recorded and trying to find the individual source signals from 
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these mixed recordings. The very well-known example is a cocktail party, 
with multiple people talking while there are recorders in different places of the 
room. This can be mathematically modeled by considering Si(t),i = 1,..., N 
as the set of sources and Xi(t),i = 1, ...,N as the observations. Therefore: 

x(t) = As (t) (1) 

where A is the unknown mixing matrix. Recovering the original signals Sj(f) 
from the recordings Xi(t) is the objective of BSS. Therefore, estimating the 
unmixing matrix W such that W = A -1 , the sources can be estimated: 

s(t) = Wx(t) (2) 

But how to find W when you are ’’blind” to the nature of mixing matrix 
and the source signals themselves? This question results in defining a set of 
constraints on the nature of source signals. As is obvious from the name, 
ICA, the components should be statistically independent. Mathematically 

speaking, having two random variables si and s 2 , the joint probability density 
function (pdf) Pi, 2 (si,s 2 ) for independent variables can be computed as: 

PiAsi, s 2 ) = pi(s 1 )p 2 (s 2 ) (3) 

where Pi(si) and p 2 (s 2 ) are the marginal pdf’s. 

Another constraint of the signal sources is that they should be non- 
Gaussian. Based on the central limit theorem, the distribution of a sum 
of independent signals with arbitrary distributions tends to be a Gaussian 
distribution, which basically means any Gaussian distribution can be consid¬ 
ered as a sum of multiple independent signals. Therefore having Gaussian 
signals as components prevents us from being able to distinguish them prop- 
erly. 

ICA has two ambiguities. The variance and the sign of the independent 
components cannot be estimated. This is usually insignificant in most of the 
applications. The second ambiguity is regarding the order of the extracted 
components which cannot be specified. 

As for pre-processing the data before independent component analysis, 
centering the data (subtracting the mean value) is usually the first step in or¬ 
der to make the ICA procedure easier. The mean will be added after extract¬ 
ing the components. Whitening the data for having uncorrelated components 
with unit variances is next. 
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Here four well-known ICA methods are used for noise reduction of reti¬ 
nal OCT images: InfoMax (RUNICA) (18], FastICA [19], JADE [20] and 
SOBI [21] . In each one of these techniques a contrast/likelihood function is 
defined and optimized for achieving the optimal separation between sources. 
The InfoMax method works based on maximizing the output entropy (in¬ 
formation flow) of a neural network with non-linear outputs. FastICA on 
the other hand tries to maximize the non-Gaussianity using an approxima¬ 
tion for negentropy utilizing a fixed-point numerical scheme. Joint Approx¬ 
imate Diagonalisation of Eigen-matrices (JADE) method works based on 
digonalization of the sample data’s fourth-order cumulants of the whitened 
process created from the sample data and a whitening matrix. On the 
other hand, Second Order Blind Identification (SOBI) assumes that the 
sample data is gathered from a set of temporally correlated sources and 
tries to separate them by joint diagonalization of several correlation matri¬ 
ces. These techniques are widely used in different applications, like elec¬ 
troencephalogram (EEG) source separation [22] • In the following section 
brief overviews will be given for each one of the above-mentioned meth¬ 
ods. The reader is referred to and references 

therein, as well as EEGLAB [30] and ICA Central (http://perso.telecom- 
paristech.fr/~cardoso/icacentral/) for further reading regarding other ICA 
techniques and pointers for implementations. 


3. Methods 


3.1. InfoMax 

InfoMax is based on a neural network approach [18], which tries to max¬ 
imize the mutual information of the network’s output Y about its input X, 
as follows: 

I(Y,X) = H(Y)-H(Y\X) (4) 

where H(Y ) is the entropy of output Y while H(Y\X) represents the entropy 
of the output that is not a result of input. Considering only the gradient of 
information theoretic quantities with respect to some parameter w in the 
network, while knowing that H(Y\X) does not depend on w, we can have: 


d_ 

dw 


I(X,Y) 


l H « 


(5) 


Having a neural network with input vector x, weight matrix W, bias 
vector wq and monotonically transformed output vector y = g(Wx) + wq 
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with g{u ) being a sigmoidal function ( g{u ) = (1 + e “) 1 ), the learning rule 
will be of the form: 


AWoc[W T ]- 1 + (l-2y)x T (6) 

where 1 is a vector of ones. At each iteration W is updated until it reaches 
convergence. 

3.2. FastiCA 

The differential entropy H of a random vector y with density function 
f(y ) can be defined as: 

H(y) = - J f(y)logf(y)dy (7) 

It has been proved that [3TJ 02] a Gaussian variable has the largest entropy 
among all random variables of equal variance. Therefore to obtain a measure 
for non-Gaussianity, a modified definition of differential entropy is used which 
is called negentropy: 


J(y ) = H(y gauss ) - H(y) (8) 

in which H(y gauss ) is a Gaussian random variable of the same covariance 
matrix as y. This measure is always non-negative since the second term in 
subtraction is always smaller or equal to the first term. For FastICA, an 
approximation of the negentropy is used: 

J(y) oc [E{G(</)} - £{G(!,)}] 2 (9) 

v being a zero-mean and unit variance Gaussian variable and G a non¬ 
quadratic function. Two examples for proper function G are G\(u) = A log{cosh{a\u )) 
and G 2 (m) = —exp(—u 2 / 2) for 1 < cq < 2. More elaboration on these choices 
can be found in [26]. Maximization of the non-Gaussianity of w T x is done 
using a fixed-point scheme consisting of several steps, with g being the deriva¬ 
tive of G, as follows: 

1. initialization of the weight vector w; 

2. w + = E{x(g(w T x))} — E{g{w T x)}w ; 

3. w — u; + /||u; + ||; 

4. goto step 2, until convergence. 
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In this formulation, the convergence happens when the new and old w vectors 
are in the same direction which means having their dot product equal to zero. 
Of course this can happen even if they are in opposite directions too, which 
is valid since that represents one of he ambiguities of the ICA. 

3.3. SOBI 

Assuming a more general case of mixed sources with addition of noise 
n(t) we have: 

x(f) = As (t) + n(t) (10) 

where A is a full-ranked complex matrix. The main assumption is to have 
second order stationary and mutually uncorrelated sources while the additive 
noise is assumed to be spatially and temporally white and uncorrelated with 
the source signals. SOBI tries to separate the components by joint diago- 
nalization of several correlation matrices. Based on these assumptions, the 
algorithm of robust SOBI for N samples consists of several steps as follows 

m- 

1. robust orthogonalization of the sensor signals, x = Q x(t), Q being the 
orthogonalization matrix; 

2. estimate the set of covariance matrices: 

1 N 

R.r(p,:) = —^2x(t)x T (t - Pi) = QR r (p;)Q r 
v k =1 

for a set of time lags (pi,p 2 , ■■■,Pl)', 

3. performing joint approximate diagonalization: R s (pi) = UD,;U r ,Vi, 
which means estimating the orthogonal matrix U; 

4. estimating the source signals 

s(t ) = U T Q x(t) 

and the mixing matrix 

W = Q+U. 


3.4. JADE 

The JADE (Joint Approximate Diagonalization of Eigenmatrices) is a 
natural extension of the SOBI. The algorithm consists of the following steps 
0 ]: 
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1. robust pre-whitening or orthogonalization: x = Q x(t)\ 

2. performing eigen value decomposition (EVD) of the sampled contracted 
quadri-covariance matrix 


C,(I) = ^ ^{x T (t)x(t)x(t)x T (t )} - 2R*(0)R*(0) - tr(R 2 (0))Rg(0) 

V k =1 

= UA;U T 

( 11 ) 


with R*(0) = jf and U = [ui, u 2 , u n ]; 

3. estimating the n sampled contracted quadri-covariance matrices: 


Cg(Ep) = ^2{x T (t)E p x(t)x(t)x T (t)} - R 2 (0)EpRs(0) 

V k =1 

—tr (EpRg(0))R s (0) - R,E/ R, 


( 12 ) 


for E p = u p Up,p = 1,2,...,n; 

4. finding an orthogonal joint diagonalization matrix U for all n matrices; 

5. estimating the mixing matrix using W = Q + U. 

For more elaboration on the methods for joint approximate diagonaliza¬ 
tion, the reader is referred to |21Jj. 

4. Results and Discussion 

For testing the methods, multiple B-scans of the same cross-section in 
retina are acquired using a Bioptigen OCT device (Bioptigen, Inc. Durham, 
NC 27709) available at AOIP (Advanced Ocular Imaging Program, Medical 
College of Wisconsin, Milwaukee, WI). Even though the subject is asked 
to fixate, the involuntary eye movements (tremor, drifts, micro-saccades) 
cause misalignment between consecutive B-scans. Considering the effect of 
eye movements within each B-scan as negligible, a parametric registration 
technique, namely rigid registration, is suitable for compensating the motion 
between B-scans. For this, the rigid registration technique implemented in 
IrnageJ [H] is used. After aligning the B-scans, they are vectorized and 
aggregated into a data matrix and then 1CA techniques, implemented in 


MATLAB (The MathWorks, Inc, Natick, MA 01760), are applied to extract 
the independent components in the data matrix. The main component is the 
noise-free image, while the rest are only containing noise. 

For assessing the performance of the methods, several metrics are consid¬ 
ered. Considering 9 regions of interest (ROI) as depicted in Fig. 1 in the final 
result, one only containing background noise and the rest containing image 
features and homogeneous regions, the metrics can be defined as follows: 


SNR m = 20 x log w (^) 

(?b 


CNR m 


f^r 


R b 


+ at 


ENL m = 


(13) 


where fib and ay are the mean and standard deviation of the background 
noise and fi m and cr m are the mean and standard deviation of the m-th ROI 
containing image features. ENL provides a measure for smoothness of homo¬ 
geneous regions while SNR and CNR provide information on the signal and 
contrast of image features with respect to the background noise. The aver¬ 
age of these metrics are considered here for comparison. Different numbers 
of images of the human retina in the central foveal region are considered for 
assessing the performance of the proposed algorithm. 

Fig. 2 displays the improvement of SNR for different methods using 
different numbers of input B-scans. Theoretically, having N images, the SNR 
can be improved by a /N using average/median filtering. As is obvious, this is 
not the case for ICA techniques. While all of the ICA techniques outperform 
median filtering at first, for using up to 20 input B-scans, only the SOBI 
remains superior for the rest and RUNICA, JADE and FastICA perform 
poorly. As mentioned before, one of the main constraints on the input data 
for being able to use ICA is having uncorrelated non-Gaussian components. 
This can’t be satisfied, especially when using excessive number of images. 
The same pattern can be seen in Fig. 3 and Fig. 4 for the improvement 
of CNR and ENL, respectively. In case of SOBI, the assumption of having 
temporally correlated sources in the design of the algorithm makes it better 
in terms of SNR, CNR and ENL since the input images are acquired from 
the same location in retina, with the possibility of misalignment due to eye 
movement. 
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Figure 2: SNR improvement for different number of input images (5-50). 


As for the computational complexity, while JADE shows a quadratic be¬ 
havior for convergence, the rest show linear increment when having more 
input B-scans. Fig. 5 shows the graphs for computational time of different 
methods for different number of input images. Fig. 6 displays close-ups of 
the original and filtered version of the input images using different techniques 
for comparison. 

5. Conclusion 

In this paper, a new application of ICA techniques for noise reduction 
of retinal OCT images is proposed. Having a set of B-scans from the same 
location (to some extent) in retina, the eye movement is compensated us¬ 
ing a parametric registration technique for compensating translation and 
rotation between consecutive B-scans. Then, taking advantage of four ICA 
techniques, RUNICA, JADE, FastICA and SOBI, the aggregated dataset 
is analyzed and the noise free image is extracted. Having N B-scans with 
uncorrelated speckle, the expected improvement in SNR for average/median 
filtering is V7V. While this is satisfied for up to 20 input B-scans, with ICA 
techniques outperforming the median filtering, having more input B-scans 
results in poor performance in all of the ICA methods, except for SOBI. 
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Figure 3: CNR improvement for different number of input images (5-50). 



Figure 4: ENL improvement for different number of input images (5-50). 
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Figure 5: Comparison of computational times. 


As for the comparison of the computational time, JADE has a quadratic 
behavior, while the rest show linear increment in computational time when 
having more images. Overall, SOBI is the best among the ICA techniques 
considered here in terms of performance based on SNR, CNR and ENL, while 
needing less computational power. As for pointers towards next possible ar¬ 
eas for research, newer techniques for ICA/BSS as well as more analysis on 
the speckle noise model in OCT images should be explored and investigated, 
while reducing the computational complexity and exploring the possibility of 
using the techniques in an on-line manner can be considered as next steps. 

6. Acknowledgment 

The authors would like to thank Dr. Joseph Carroll from Advanced 
Ocular Imaging Program (AOIP), Medical College of Wisconsin (MCW), 
Milwaukee, WI for providing the data and insight. This work was partially 
supported by NIH P30EY001931. Support received by grant 8UL1TR000055 
from the Clinical and Translational Science Award (CTSA) program of the 
National Center for Research Resources and the National Center for Ad¬ 
vancing Translational Sciences. Support received by the Clinical and Trans¬ 
lational Science Institute of Southeast Wisconsin through the Advancing a 


12 











(a) (b) (c) (d) (e) 

Figure 6: A portion of one of the input images (a) and the results of InfoMax 
(b), JADE (c), FastICA (d) and SOB1 (e), resulted from 50 B-scans. 
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